Total derivative

In the mathematical field of differential calculus, the term total derivative has a number of closely related meanings.

Differentiation with indirect dependencies

Suppose that f is a function of two variables, x and y. Normally these variables are assumed to be independent. However, in some situations they may be dependent on each other. For example, y could be a function of x, constraining the domain of f to a curve in \mathbb{R}^2. In this case the partial derivative of f with respect to x does not give the true rate of change of f with respect to changing x, because changing x necessarily changes y. The total derivative takes such dependencies into account.

For example, suppose

f(x,y)=xy.

When x and y are independent, the rate of change of f with respect to x is the partial derivative of f with respect to x; in this case,

\frac{\partial f}{\partial x} = y.

However, if y depends on x, the partial derivative does not give the true rate of change of f as x changes because it holds y fixed.

Suppose we are constrained to the line

y=x

then

f(x,y) = f(x,x) = x^2.

In that case, the total derivative of f with respect to x is

\frac{\operatorname{d}f}{\operatorname{d}x} = 2 x.

Notice that this is not equal to the partial derivative:

\frac{\operatorname{d}f}{\operatorname{d}x} = 2 x \neq \frac{\partial f}{\partial x} = y = x.
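
To make the distinction concrete, the following minimal sketch (assuming the SymPy library is available) computes both derivatives symbolically; the variable names mirror the example above.

import sympy as sp

x, y = sp.symbols('x y')
f = x * y

# Partial derivative: y is held fixed while x varies.
print(sp.diff(f, x))             # y

# Total derivative along the constraint y = x:
# substitute the constraint first, then differentiate.
print(sp.diff(f.subs(y, x), x))  # 2*x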

While one can often perform substitutions to eliminate indirect dependencies, the chain rule provides a more efficient and general technique. Suppose M(t, p_1, \dots, p_n) is a function of time t and of n variables p_i, which themselves depend on time. Then the total time derivative of M is

 {\operatorname{d}M \over \operatorname{d}t} = \frac{\operatorname d}{\operatorname d t} M \bigl(t, p_1(t), \ldots, p_n(t)\bigr).

The chain rule for differentiating a function of several variables implies that

 {\operatorname{d}M \over \operatorname{d}t}
= \frac{\partial M}{\partial t} + \sum_{i=1}^n \frac{\partial M}{\partial p_i}\frac{\operatorname{d}p_i}{\operatorname{d}t}
= \biggl(\frac{\partial}{\partial t} + \sum_{i=1}^n \frac{\operatorname{d}p_i}{\operatorname{d}t}\frac{\partial}{\partial p_i}\biggr)(M).

This expression is often used in physics for a gauge transformation of the Lagrangian, as two Lagrangians that differ only by the total time derivative of a function of time and the n generalized coordinates lead to the same equations of motion. The operator in brackets (in the final expression) is also called the total derivative operator (with respect to t).

For example, the total derivative of f(x(t), y(t)) is

\frac{\operatorname df}{\operatorname dt} = { \partial f \over \partial x}{\operatorname dx \over \operatorname dt } + { \partial f \over \partial y}{\operatorname dy \over \operatorname dt }.

Here there is no ∂f / ∂t term since f itself does not depend on the independent variable t directly.
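
As an illustrative check of this formula, the sketch below (again assuming SymPy, and choosing the hypothetical function f(x, y) = x^2 y purely for illustration) compares SymPy's automatic chain rule against the term-by-term sum above.

import sympy as sp

t = sp.symbols('t')
x, y = sp.Function('x')(t), sp.Function('y')(t)
f = x**2 * y                     # illustrative choice of f, not from the text

# sp.diff applies the chain rule through x(t) and y(t) automatically,
# producing the total derivative df/dt.
total = sp.diff(f, t)

# The same result assembled term by term from the formula above.
xs, ys = sp.symbols('xs ys')
fxy = xs**2 * ys
by_formula = (sp.diff(fxy, xs).subs({xs: x, ys: y}) * sp.diff(x, t)
              + sp.diff(fxy, ys).subs({xs: x, ys: y}) * sp.diff(y, t))

print(sp.simplify(total - by_formula))   # 0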

The total derivative via differentials

Differentials provide a simple way to understand the total derivative. For instance, suppose M(t,p_1,\dots,p_n) is a function of time t and n variables p_i as in the previous section. Then, the differential of M is

 \operatorname d M = \frac{\partial M}{\partial t} \operatorname d t + \sum_{i=1}^n \frac{\partial M}{\partial p_i}\operatorname{d}p_i.

This expression is often interpreted heuristically as a relation between infinitesimals. However, if the variables t and p_i are interpreted as functions, and M(t,p_1,\dots,p_n) is interpreted to mean the composite of M with these functions, then the above expression makes perfect sense as an equality of differential 1-forms, and is immediate from the chain rule for the exterior derivative. The advantage of this point of view is that it takes into account arbitrary dependencies between the variables. For example, if p_1^2=p_2 p_3 then 2p_1\operatorname dp_1=p_3 \operatorname d p_2+p_2\operatorname d p_3. In particular, if the variables p_i are all functions of t, as in the previous section, then

 \operatorname d M
= \frac{\partial M}{\partial t} \operatorname d t + \sum_{i=1}^n \frac{\partial M}{\partial p_i}\frac{\operatorname d p_i}{\operatorname d t}\,\operatorname d t.
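
The constraint example p_1^2 = p_2 p_3 can be checked by realizing all three variables as functions of a parameter t and differentiating both sides; dividing out dt recovers the stated relation between the differentials. A minimal SymPy sketch, under the same assumption that the library is available:

import sympy as sp

t = sp.symbols('t')
p1, p2, p3 = (sp.Function(name)(t) for name in ('p1', 'p2', 'p3'))

# Differentiate both sides of the constraint p1**2 == p2*p3 with respect to t.
# Multiplying through by dt gives 2*p1*dp1 = p3*dp2 + p2*dp3.
lhs = sp.diff(p1**2, t)          # 2*p1(t)*Derivative(p1(t), t)
rhs = sp.diff(p2 * p3, t)        # p3(t)*Derivative(p2(t), t) + p2(t)*Derivative(p3(t), t)
print(sp.Eq(lhs, rhs))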

The total derivative as a linear map

Let U\subseteq \mathbb{R}^{n} be an open subset. Then a function f:U\rightarrow \mathbb{R}^m is said to be (totally) differentiable at a point p\in U if there exists a linear map \operatorname df_p:\mathbb{R}^n \rightarrow \mathbb{R}^m (also denoted D_pf or Df(p)) such that

\lim_{x\rightarrow p}\frac{\|f(x)-f(p)-\operatorname df_p(x-p)\|}{\|x-p\|}=0.

The linear map \operatorname d f_p is called the (total) derivative or (total) differential of f at p. A function is (totally) differentiable if its total derivative exists at every point in its domain.

Note that f is differentiable if and only if each of its components f_i:U\rightarrow \mathbb{R} is differentiable. For this it is necessary, but not sufficient, that the partial derivatives of each function f_i exist. However, if these partial derivatives exist and are continuous, then f is differentiable and its differential at any point is the linear map determined by the Jacobian matrix of partial derivatives at that point.
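
As a sketch of that last statement, the following example (assuming SymPy, with the map f(x, y) = (x^2 y, sin x + y) chosen arbitrarily for illustration) computes the Jacobian matrix and evaluates it at a point, giving the linear map \operatorname df_p there.

import sympy as sp

x, y = sp.symbols('x y')

# An illustrative map f : R^2 -> R^2 (not taken from the text above).
f = sp.Matrix([x**2 * y, sp.sin(x) + y])

# The Jacobian matrix of partial derivatives.
J = f.jacobian(sp.Matrix([x, y]))
print(J)                          # Matrix([[2*x*y, x**2], [cos(x), 1]])

# The total derivative df_p at p = (1, 2) is the linear map given by J there.
print(J.subs({x: 1, y: 2}))       # Matrix([[4, 1], [cos(1), 1]])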

Total differential equation

A total differential equation is a differential equation expressed in terms of total derivatives. Since the exterior derivative is a natural operator, in a sense that can be given a technical meaning, such equations are intrinsic and geometric.

Application of the total differential to error estimation

In measurement, the total differential is used in estimating the error Δf of a function f based on the errors Δx, Δy, ... of the parameters x, y, .... Assuming that the interval is short enough for the change to be approximately linear:

\Delta f(x) \approx f'(x) \, \Delta x

and that all variables are independent, then for all variables,

\Delta f = \left|f_x\right| \Delta x + \left|f_y\right| \Delta y + \cdots

This is because the partial derivative f_x with respect to the particular parameter x gives the sensitivity of the function f to a change in x, in particular to the error Δx. Because the variables are assumed to be independent, the analysis describes the worst-case scenario. The absolute values of the component terms are used because a partial derivative may be negative, and the errors must not be allowed to cancel. From this principle the error rules of summation, multiplication, etc. are derived, e.g.:

Let f(a, b) = a × b;
Δf = f_aΔa + f_bΔb; evaluating the derivatives,
Δf = bΔa + aΔb; dividing by f, which is a × b,
Δf/f = Δa/a + Δb/b.

That is to say, in multiplication, the total relative error is the sum of the relative errors of the parameters.
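
The following minimal sketch reproduces this multiplication rule symbolically (assuming SymPy; the symbols Delta_a and Delta_b are hypothetical names for the parameter errors, declared positive so the absolute values simplify):

import sympy as sp

# Positive symbols so that absolute values simplify cleanly.
a, b, da, db = sp.symbols('a b Delta_a Delta_b', positive=True)

f = a * b

# Worst-case error: sum of |partial derivative| times each parameter's error.
delta_f = sp.Abs(sp.diff(f, a)) * da + sp.Abs(sp.diff(f, b)) * db

# Dividing by f gives the total relative error.
print(sp.expand(delta_f / f))     # Delta_a/a + Delta_b/b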
